sentence order
OrderSum: Semantic Sentence Ordering for Extractive Summarization
The sentence-level framework defines extractive summarization as an individual sentence selection problem, determining whether each sentence in a document should be included in the summary. However, the sentence-level framework often produces summaries that contain only general sentences or repeat important but similar sentences (Narayan et al., 2018b; Zhong et al., 2020). The summary-level framework overcomes this limitation by defining extractive summarization as a summary ranking problem rather than a sentence selection problem. The main idea of the summary-level framework is to generate a set of candidate summaries consisting of different sentences, and then rank them to select the best summary. By considering sentence composition at the entire summary level rather than sentence by sentence, this approach enables each sentence in the summary to convey different, specific information (Narayan et al., 2018b; Zhong et al., 2020). Previous work in both frameworks has primarily focused on improving which sentences to include in the summary, or in other words, sentence inclusion. However, to the best of our knowledge, the importance of sentence order in summaries has not been highlighted since the era of graph-based extractive summarization (Mihalcea and Tarau, 2004; Erkan and Radev, 2004). The sentence order of a text plays a crucial role not only in readability but also in its meaning (Yin et al., 2019; Logeswaran et al., 2018). Table 1 illustrates how the
arXiv:2502.16180v1
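The summary-level idea described in this excerpt (build candidate summaries from different sentence subsets, then rank them) can be sketched as follows. This is a minimal illustration, not the OrderSum method: the function names and the vocabulary-coverage score are assumptions made for this example, standing in for whatever learned ranker a real system would use.

```python
from itertools import combinations


def tokenize(text):
    """Very rough whitespace tokenizer (illustrative only)."""
    return text.lower().split()


def coverage_score(candidate_sentences, document_sentences):
    """Hypothetical summary-level score: share of distinct document
    words that appear somewhere in the candidate summary."""
    doc_vocab = {w for s in document_sentences for w in tokenize(s)}
    cand_vocab = {w for s in candidate_sentences for w in tokenize(s)}
    return len(cand_vocab) / len(doc_vocab) if doc_vocab else 0.0


def rank_candidates(document_sentences, summary_size):
    """Enumerate every subset of `summary_size` sentences and rank the
    resulting candidate summaries as whole units, best first."""
    scored = []
    for idx in combinations(range(len(document_sentences)), summary_size):
        candidate = [document_sentences[i] for i in idx]
        scored.append((coverage_score(candidate, document_sentences), idx))
    # Highest score first; ties broken by sentence indices.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored


document = ["the cat sat", "the cat sat", "a dog ran"]
ranked = rank_candidates(document, summary_size=2)
best_score, best_indices = ranked[0]
```

Because the score is computed on the whole candidate, a summary that repeats two near-identical sentences ranks below one whose sentences carry different information, which is the behavior the summary-level framework is meant to encourage. Exhaustive enumeration is only practical for short documents; it is used here to keep the sketch simple.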
Sentence-Permuted Paragraph Generation
Yu, Wenhao, Zhu, Chenguang, Zhao, Tong, Guo, Zhichun, Jiang, Meng
Generating paragraphs with diverse content is important in many applications. Existing generation models produce similar content from homogenized contexts because of the fixed left-to-right sentence order. Our idea is to permute the sentence order to improve the content diversity of multi-sentence paragraphs. We propose a novel framework, PermGen, whose objective is to maximize the expected log-likelihood of output paragraph distributions with respect to all possible sentence orders. PermGen uses hierarchical positional embedding and designs new procedures for training, decoding, and candidate ranking in sentence-permuted generation. Experiments on three paragraph generation benchmarks demonstrate that PermGen generates more diverse outputs of higher quality than existing models.
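The objective stated in this abstract, an expected log-likelihood taken over sentence orders, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the orders are weighted uniformly, and `toy_log_prob` is a hypothetical stand-in for a trained model's log-probability of a paragraph in a given order. It is not PermGen's actual training procedure.

```python
from itertools import permutations


def expected_log_likelihood(sentences, log_prob):
    """Average log_prob over every ordering of `sentences`,
    assuming (for this sketch) a uniform distribution over orders."""
    orders = list(permutations(sentences))
    return sum(log_prob(list(order)) for order in orders) / len(orders)


def toy_log_prob(ordered_sentences):
    """Hypothetical scorer: prefers paragraphs that open with 'a'."""
    return 0.0 if ordered_sentences[0] == "a" else -1.0


value = expected_log_likelihood(["a", "b", "c"], toy_log_prob)
```

The number of orders grows factorially with the number of sentences, so a practical system would presumably sample orders rather than enumerate them all; the enumeration here only makes the expectation explicit.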